Day 16 - Regular expressions -
Multiple matches
– A date gives you a corsage, not a multiple fracture.
Little Shop of Horrors (1986)
Well, not bad at all! We are still alive after 3 lessons about something that is considered an advanced
topic. Congrats! I hope you are not only surviving, but actually enjoying the journey. I think you
are starting to appreciate that regular expressions are not actually difficult; they are, however,
complicated, full of special symbols and rules. So far we have learned how to use . for any character, square brackets
[ and ] for classes and ranges with - inside them, and finally the two anchors ^ and $.
Today we’ll have a look at multiple matches. Generally speaking, a multiple match is a repeated
match of the preceding regular expression, and a typical use case is when you need to match a specific
number of digits or letters. Multiple matches can also be less specific, though, for example matching an
indefinite number of lowercase letters.
Let’s start with exact matches, which are performed with the syntax {N}, where N is the number of
repetitions. As I said, all multiple match operations refer to the preceding regular expression, so if you
write
$ grep -E "a{2}" examples.txt
aardvark
you are asking grep to match all groups of 2 adjacent characters a, as in aardvark. The number
between braces can be any positive integer, even though using 1 makes no sense, as a single
character is already a regular expression matching one repetition of itself. So, while you can execute
$ grep -E "a{1}" examples.txt
and get the correct result, this is equivalent to
$ grep -E "a" examples.txt
and I personally don’t see the point in making the regular expression harder to read by introducing
the braces. If you like complicating your life, try to create a social network in PHP. Wait a minute,
what do you mean they did it?
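If you want to check both claims yourself, here is a minimal sketch. It assumes a small invented word list in a temporary file, since the actual contents of examples.txt are not shown here; the variable names are made up for the illustration as well.

```shell
# Hypothetical sample file: the words are invented for this sketch
# and are not the contents of the book's examples.txt.
sample="$(mktemp)"
printf '%s\n' aardvark banana cat bazaar dog > "$sample"

# a{2} selects the lines that contain two adjacent "a" characters.
two_a="$(grep -E "a{2}" "$sample")"
echo "$two_a"

# a{1} and a plain a should select exactly the same lines.
one_a="$(grep -E "a{1}" "$sample")"
plain_a="$(grep -E "a" "$sample")"
if [ "$one_a" = "$plain_a" ]; then
    echo "a{1} and a select the same lines"
fi

# Clean up the temporary file.
rm -f "$sample"
```

With this word list, the first command should print only aardvark and bazaar, while the other two should both select every word except dog.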
The braces repeat the previous regular expression component, so the syntax a{2} is equivalent to a
literal aa. The syntax can be used to repeat more than single letters, though, as the braces apply to any
previous component of the regular expression. This command, for example